Beyond Basic Search: Addressing the Limitations of Semantic Similarity

Beyond Similarity

The "80% problem" occurs when basic semantic search works well for simple queries but fails on edge cases. When we search by similarity alone, the vector store returns the chunks with the highest numerical similarity to the query. If those chunks are nearly identical, the large language model (LLM) receives duplicated information, which wastes the limited context window and leaves out broader perspectives.

Pillars of Advanced Retrieval

Maximal Marginal Relevance (MMR): Instead of selecting only the most similar items, MMR balances relevance against diversity to avoid redundant information. $\text{MMR} = \arg\max_{d \in R \setminus S} \left[ \lambda \cdot \text{sim}(d, q) - (1 - \lambda) \cdot \max_{s \in S} \text{sim}(d, s) \right]$, where $q$ is the query, $R$ is the set of retrieved candidates, $S$ is the set already selected, and $\lambda \in [0, 1]$ sets the trade-off between relevance and diversity.
Self-Query Retrieval: Uses an LLM to translate natural language into structured metadata filters (for example, filtering by "Lecture 3" or "Source: PDF").
Contextual Compression: Shrinks the retrieved documents so that only the "high-nutrient" passages relevant to the question are kept, which saves tokens.
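
The MMR rule above can be sketched as a greedy loop. This is a minimal illustration rather than any particular library's implementation: the function names (cosine, mmr_select), the toy 2-D vectors, and the choice of λ = 0.3 are assumptions made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mmr_select(query_vec, doc_vecs, k, lam=0.5):
    """Greedy MMR: return the indices of up to k documents, in selection order."""
    selected = []                            # S in the formula
    remaining = list(range(len(doc_vecs)))   # R \ S in the formula

    def score(i):
        relevance = cosine(doc_vecs[i], query_vec)
        # Redundancy is the highest similarity to anything already chosen.
        # It is 0 for the first pick because S is still empty.
        redundancy = max(
            (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
            default=0.0,
        )
        return lam * relevance - (1 - lam) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy 2-D "embeddings" (hypothetical values, not produced by a real model).
query = [1.0, 0.0]
docs = [
    [1.0, 0.10],  # 0: highly relevant
    [1.0, 0.12],  # 1: near-duplicate of 0
    [0.7, 0.70],  # 2: less relevant, but a different angle
]

# Baseline: rank by similarity to the query only.
top2_similarity = sorted(
    range(len(docs)), key=lambda i: cosine(docs[i], query), reverse=True
)[:2]

# MMR with a diversity-leaning trade-off (assumed value).
top2_mmr = mmr_select(query, docs, k=2, lam=0.3)

print(top2_similarity, top2_mmr)
```

On this toy data, plain top-2 similarity returns the two near-duplicates (indices 0 and 1), while MMR with λ = 0.3 returns indices 0 and 2. With λ = 0.5 the near-duplicate is still selected here, so λ generally needs tuning for the data at hand.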

The Redundancy Trap

Feeding three versions of the same passage to the LLM does not make it smarter; it only makes the prompt more expensive. Diversity is the key to building a "high-nutrient" context.

Knowledge Check

You want your system to answer the specific question "What did the instructor say about probability in the third lecture?" Which tool lets the LLM automatically apply a filter such as { "source": "lecture3.pdf" }?

ConversationBufferMemory

Self-Querying Retriever

Contextual Compression

MapReduce Chain

Challenge: The Token Limit Dilemma

Apply advanced retrieval strategies to solve a real-world constraint.

You are building a RAG system for a legal firm. The documents retrieved are 50 pages long, but only 2 sentences per page are actually relevant to the user's specific query. The standard "Stuff" chain is throwing an OutOfTokens error because the context window is overflowing with irrelevant text.

Step 1

Identify the core problem and select the appropriate advanced retrieval tool to solve it without losing specific nuances.

Problem: The context window limit is being exceeded by "low-nutrient" text surrounding the relevant facts.

Tool Selection: ContextualCompressionRetriever

Step 2

What specific component must you use in conjunction with this retriever to "squeeze" the documents?

Solution: Use an LLMChainExtractor as the retriever's base compressor (its base_compressor argument). It processes each retrieved document and extracts only the snippets relevant to the query, so a much smaller, highly concentrated context is passed to the final prompt.
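
The extraction step can be sketched in plain Python to show the shape of the idea. This is an illustration under stated assumptions, not LangChain's implementation: a real LLMChainExtractor asks an LLM which passages are relevant, whereas the keyword_relevant function below is a hypothetical keyword-overlap stand-in, and the sample pages and helper names are invented for the example.

```python
import re


def split_sentences(text):
    """Naive sentence splitter; adequate for a sketch only."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def keyword_relevant(sentence, query):
    """Hypothetical stand-in for the LLM's relevance judgment.

    Returns True when the sentence shares at least one content word
    with the query.
    """
    stop = {"the", "a", "an", "of", "to", "in", "is", "what", "for", "and", "on"}
    q_terms = {w for w in re.findall(r"[a-z]+", query.lower()) if w not in stop}
    s_terms = set(re.findall(r"[a-z]+", sentence.lower()))
    return bool(q_terms & s_terms)


def compress(pages, query, is_relevant=keyword_relevant):
    """Keep only the relevant sentences of each page; drop pages with none."""
    compressed = []
    for page in pages:
        kept = [s for s in split_sentences(page) if is_relevant(s, query)]
        if kept:
            compressed.append(" ".join(kept))
    return compressed


# Invented sample pages standing in for long legal documents.
pages = [
    "This agreement is made in 2021. The termination notice period is 30 days. "
    "Signatures follow below.",
    "The parties met in Hanoi. Lunch was provided.",
]
query = "What is the termination notice period?"

context = compress(pages, query)
print(context)
```

Only the one sentence about the termination notice period survives, so the final prompt carries a fraction of the original text. Swapping the is_relevant argument for an LLM-backed judgment is what moves this from keyword matching toward the behavior described in the solution.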